Usages of Built-in Experiments
==============================

This page introduces the details of the built-in experiment classes, which cover inductive and transductive semi-supervised learning with or without a graph.

The Process of Built-in Experiment Class
----------------------------------------

The main process of an experiment consists of four stages: ``load dataset``, ``data manipulate``, ``hyper-parameters selection`` and ``model evaluation``. Before the experiments start, you are asked to configure the datasets, evaluation metrics, estimators and their candidate parameters. The experiment classes then carry out the whole process without further attention.

We provide two built-in experiment classes: ``SslExperimentsWithoutGraph`` and ``SslExperimentsWithGraph``. The common steps to run an experiment are:

* Initialize an instance of an experiments class
* Set the estimators to evaluate with the ``append_configs`` method
* Set the datasets with ``append_datasets``
* Set the evaluation metrics with the ``set_metric`` and ``append_evaluate_metric`` methods
* Run

Five Steps to Run the Experiments
---------------------------------

Here is an example:

.. code:: python

    from s3l.Experiments import SslExperimentsWithoutGraph
    from s3l.model_uncertainty.S4VM import S4VM

    # list of (name, estimator instance, dict of parameters)
    configs = [
        ('S4VM', S4VM(), {
            'kernel': 'RBF',
            'gamma': [0],
            'C1': [50, 100],
            'C2': [0.05, 0.1],
            'sample_time': [100]
        })
    ]

    # list of (name, feature_file, label_file, split_path, graph_file)
    datasets = [
        ('house', None, None, None, None),
        ('isolet', None, None, None, None)
    ]

    # 1. Initialize an object of the experiments class
    experiments = SslExperimentsWithoutGraph(transductive=True, n_jobs=4)
    # 2. Set the estimators to evaluate with `append_configs`
    experiments.append_configs(configs)
    # 3. Set the datasets with `append_datasets`
    experiments.append_datasets(datasets)
    # 4. Set the evaluation metrics with `set_metric`
    experiments.set_metric(performance_metric='accuracy_score')
    # optional: additional metrics to evaluate the best model
    experiments.append_evaluate_metric(performance_metric='zero_one_loss')
    experiments.append_evaluate_metric(performance_metric='hamming_loss')
    # 5. Run
    results = experiments.experiments_on_datasets(unlabel_ratio=0.75,
                                                  test_ratio=0.2,
                                                  number_init=2)

During initialization, first choose between ``SslExperimentsWithoutGraph`` and ``SslExperimentsWithGraph`` depending on whether a graph is used. Besides, specify the semi-supervised scheme as inductive (``transductive=False``) or transductive (``transductive=True``), and decide how many CPU cores you want to use.

Then call ``append_configs``, ``append_datasets`` and ``set_metric`` in any order to configure the experiments:

* ``append_configs`` takes a list of tuples of the form (name, object, parameters_dict). *name* is an arbitrary string, *object* is an estimator instance, and *parameters_dict* is a dict whose keys are the names of the corresponding estimator's parameters and whose values are lists of candidate values.
* ``append_datasets`` takes a list of tuples of the form (name, feature_file, label_file, split_path, graph_file). *name* is a string used for output; *feature_file*, *label_file*, *split_path* and *graph_file* can each be a string or ``None``, where a string should be the absolute path of the file you provide. If you use built-in datasets, *feature_file* and *label_file* can be ``None``. If *split_path* is ``None``, the experiments class splits the data every time you run. *graph_file* must be set when the experiment needs a graph.
* ``set_metric`` configures the evaluation metric used in ``hyper-parameters selection``; the best model is selected based on this metric. `Here is a list of supported metrics <http>`__.
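To picture what ``hyper-parameters selection`` does with the chosen metric, the sketch below scores each candidate parameter setting and keeps the best one according to whether the metric is larger-is-better. The ``select_best`` function and the candidate scores are hypothetical illustrations, not part of the s3l API:

.. code:: python

    # Hypothetical illustration of metric-based model selection;
    # not part of the s3l API.
    def select_best(scores, metric_large_better=True):
        """Pick the parameter setting whose score is best under the metric.

        scores: dict mapping a parameter setting (as a tuple) to its
        validation score. metric_large_better: True if a larger score is
        better (e.g. accuracy_score), False otherwise (e.g. a loss).
        """
        pick = max if metric_large_better else min
        return pick(scores, key=scores.get)

    # Made-up accuracy scores for two (C1, C2) candidate settings
    scores = {('C1=50', 'C2=0.05'): 0.91, ('C1=100', 'C2=0.1'): 0.88}
    best = select_best(scores, metric_large_better=True)
    # best == ('C1=50', 'C2=0.05')

With a loss metric you would pass ``metric_large_better=False``, and the setting with the smallest score would be selected instead.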
Please note that the ``metric_large_better`` parameter indicates whether a larger value of the metric is better.

* ``append_evaluate_metric`` appends other metrics that will be used to evaluate the best model selected in ``hyper-parameters selection``. `Here is a list of supported metrics <http>`__.

Attention
---------

1. In order to reduce repetitive code, we define some protocols that estimators and metrics must follow. Refer to :ref:`How to Implement Your Own Estimators:How to Implement Your Own Estimators` for more details.
2. When debugging, disable parallel mode by setting ``n_jobs`` to 1; otherwise your code won't stop at the breakpoint.
3. If the built-in experiment process doesn't meet your demands, you can design your own settings (refer to `How to Design Your Own Experiments` and :ref:`s3l/Experiments:Experiments`).
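Regarding the estimator protocol mentioned in point 1, the skeleton below is a minimal sketch assuming a scikit-learn-style interface (``set_params`` / ``fit`` / ``predict``). The class, its ``smoothing`` hyper-parameter, and the ``instance_indexes`` argument are illustrative assumptions; the exact methods required by s3l are defined in the linked page:

.. code:: python

    # Illustrative sketch only; see "How to Implement Your Own
    # Estimators" for the actual protocol required by s3l.
    import numpy as np

    class MajorityClassifier:
        """Toy estimator that always predicts the majority label."""

        def __init__(self, smoothing=0):
            self.smoothing = smoothing  # hypothetical hyper-parameter
            self.majority_ = None

        def set_params(self, **params):
            # Candidate values from `parameters_dict` would be
            # applied to the estimator through a method like this.
            for key, value in params.items():
                setattr(self, key, value)
            return self

        def fit(self, X, y, instance_indexes=None):
            # `instance_indexes` would identify the labeled instances;
            # this toy version simply uses all provided labels.
            labels, counts = np.unique(y, return_counts=True)
            self.majority_ = labels[np.argmax(counts)]
            return self

        def predict(self, X):
            return np.full(len(X), self.majority_)

An estimator shaped like this could then be listed in ``configs`` next to a dict of candidate values for its parameters, just as ``S4VM`` is in the example above.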